| Statistic | Age | Blood Pressure | Cholesterol Level | BMI | Sleep Hours | Triglyceride Level | Fasting Blood Sugar | CRP Level | Homocysteine Level |
|---|---|---|---|---|---|---|---|---|---|
| count | 4986.00 | 4989.00 | 4982.00 | 4990.00 | 4988.00 | 4994.00 | 4989.00 | 4985.00 | 4989.00 |
| mean | 49.57 | 149.79 | 225.35 | 29.01 | 6.99 | 249.28 | 119.92 | 7.48 | 12.44 |
| std | 18.14 | 17.58 | 43.58 | 6.34 | 1.76 | 87.04 | 23.63 | 4.32 | 4.33 |
| min | 18.00 | 120.00 | 150.00 | 18.00 | 4.00 | 100.00 | 80.00 | 0.01 | 5.01 |
| 25% | 34.00 | 134.00 | 187.00 | 23.46 | 5.45 | 175.00 | 99.00 | 3.70 | 8.67 |
| 50% | 50.00 | 150.00 | 226.00 | 29.04 | 6.99 | 248.00 | 120.00 | 7.49 | 12.35 |
| 75% | 65.00 | 165.00 | 263.00 | 34.48 | 8.55 | 324.00 | 141.00 | 11.18 | 16.13 |
| max | 80.00 | 180.00 | 300.00 | 40.00 | 10.00 | 400.00 | 160.00 | 15.00 | 20.00 |
Heart Disease Risk Factor Analysis Dashboard
Exploring relationships between health indicators and heart disease
Overview
This dashboard analyzes a heart disease dataset from Kaggle, examining the relationships between health indicators, lifestyle factors, and heart disease status. The analysis aims to identify risk factors and patterns associated with heart disease.
Data Source: Heart Disease Dataset on Kaggle
Program: Master of Science in Data Science, University of Colorado Boulder
This analysis uses a random 50% sample of the original dataset for visualization with Altair. For reproducibility, the sample is drawn in Pandas with a fixed random state of 0: df.sample(frac=0.50, random_state=0)
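A minimal sketch of the sampling step follows. The toy `full_df` frame and its values are made up for illustration and stand in for the real dataset; only the `df.sample(frac=0.50, random_state=0)` call comes from the analysis itself.

```python
import pandas as pd

# Toy stand-in for the full dataset (hypothetical values, 20 rows).
full_df = pd.DataFrame({
    "Age": list(range(18, 38)),
    "BMI": [20.0 + i for i in range(20)],
})

# Draw half of the rows without replacement; fixing random_state
# makes the same rows come back on every run.
sample_a = full_df.sample(frac=0.50, random_state=0)
sample_b = full_df.sample(frac=0.50, random_state=0)

print(len(sample_a))              # 10 rows, half of the 20 in full_df
print(sample_a.equals(sample_b))  # True: same seed, same sample
```

Because the seed is fixed, rerunning the dashboard on the same input file should produce the same sample and therefore the same charts.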
Summary Statistics
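The summary table reports descriptive statistics for the numerical variables in the sampled dataset. These values give a baseline view of the central tendency and spread of the key health metrics. The counts differ slightly across columns (4,982 to 4,994), which suggests that each variable has a small number of missing values.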
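A table of this shape is what `DataFrame.describe()` returns in Pandas. The sketch below assumes that is how the summary was produced. The small frame and its values are hypothetical, and the `Gender` column is included only to show that non-numeric columns are skipped by default.

```python
import numpy as np
import pandas as pd

# Hypothetical mini-dataset with a few missing values.
df = pd.DataFrame({
    "Age": [25, 40, np.nan, 60, 75],
    "BMI": [22.5, 27.0, 31.2, np.nan, 35.8],
    "Gender": ["F", "M", "F", "M", "F"],
})

# describe() summarizes numeric columns only and ignores NaN,
# so "count" is the number of non-missing values per column.
summary = df.describe().round(2)
print(summary)

# Missing values per column, to explain counts below the row total.
missing = df.isna().sum()
print(missing)
```

Comparing the `count` row against the number of rows in the frame is a quick way to spot columns with missing data before plotting.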
Demographic & Lifestyle Factors
This section explores the categorical variables in the dataset, including gender, diabetes status, smoking habits, and other lifestyle factors that may influence heart disease risk.
Gender and Diabetes Distribution
The charts below show the distribution of participants by gender and diabetes status. Both factors matter for interpretation because they can significantly affect heart disease risk.
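The counts behind distribution charts like these can be checked directly in Pandas. This is a sketch under assumptions: the column names `Gender` and `Diabetes`, the category labels, and the values are all hypothetical placeholders for the real data.

```python
import pandas as pd

# Hypothetical categorical columns; names and labels are assumptions.
df = pd.DataFrame({
    "Gender": ["Female", "Male", "Female", "Female",
               "Male", "Female", "Male", "Female"],
    "Diabetes": ["No", "No", "Yes", "No", "Yes", "No", "No", "No"],
})

# Absolute number of participants in each category.
gender_counts = df["Gender"].value_counts()

# Share of participants in each category (sums to 1).
diabetes_share = df["Diabetes"].value_counts(normalize=True)

print(gender_counts)
print(diabetes_share)
```

Reporting shares alongside raw counts makes it easier to compare groups of different sizes.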
Heart Disease Status and Smoking Habits
These charts display the distributions of heart disease status and smoking habits in the sample. Smoking is a well-known risk factor for cardiovascular disease, and this visualization shows how prevalent it is in the dataset.
Blood Pressure and Cholesterol Status
High blood pressure and high cholesterol are major risk factors for heart disease. These visualizations show the distribution of participants with and without these conditions.